- Newsgroups: comp.sources.misc
- From: Richard Goerwitz <goer@sophist.uchicago.edu>
- Subject: v20i063: jargon - jargon browser, Part01/01
- Message-ID: <1991Jun20.033537.10417@sparky.IMD.Sterling.COM>
- X-Md4-Signature: 3ce443f32cac4cb8f5c261e28f8ef252
- Date: Thu, 20 Jun 1991 03:35:37 GMT
- Approved: kent@sparky.imd.sterling.com
-
- Submitted-by: Richard Goerwitz <goer@sophist.uchicago.edu>
- Posting-number: Volume 20, Issue 63
- Archive-name: jargon/part01
- Environment: Icon
-
- Program name: jargon
- Source language: icon
- Purpose: quickly find entries in the hackers' jargon file
-
- This shell archive contains a jargon database program, aptly enough
- named "jargon." If you have version 2.7.1 (posted March 1, 1991 to
- alt.sources) or version 2.8.[1-3] (ftp from pit-manager.mit.edu,
- pub/jargon/jargon2.8.3.Z) of the hackers' jargon file, you can use
- this package for quick access to entries within that file. Just type
- "jargon word" (where "word" is the bit of hackers' jargon you want a
- definition for). If you're not sure what's in the database, type
- "jargon -p pattern" (where pattern is an egrep-style regular
- expression). Jargon will print a list of all entries containing
- pattern to the standard output. Uppercase letters are folded into
- their lowercase equivalents, incidentally, in order to provide enough
- latitude for all the different capitalization practices.
-
- Richard Goerwitz (goer@sophist.uchicago.edu)
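As a rough illustration of the "jargon -p pattern" behavior described above, the sketch below folds a few entry names to lowercase and filters them with an egrep-style regular expression. It is not the jargon program: it uses only tr and grep -E, the headwords are made-up sample data rather than the contents of a real jargon file, and the function name list_entries is hypothetical. How the real program applies case folding is determined by its Icon source, not by this sketch.

```shell
#!/bin/sh
# Sketch only: imitates case folding plus egrep-style matching with
# standard tools. The headwords are sample data and list_entries is a
# hypothetical name; neither comes from the jargon package.
list_entries() {
    # $1 is an egrep-style regular expression
    printf '%s\n' 'Zork' 'FUBAR' 'foo' 'bar' 'Foobar' |
        tr '[:upper:]' '[:lower:]' |
        grep -E -- "$1"
}

# List every sample headword that starts with "f" followed by one or more "o".
list_entries '^fo+'
```

With the sample data, the call prints "foo" and "foobar", one per line. "FUBAR" is folded to "fubar" before matching, so it is tested against the pattern in lowercase and rejected because no "o" follows the "f".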
- ---
- #! /bin/sh
- # This is a shell archive. Remove anything before this line, then feed it
- # into a shell via "sh file" or similar. To overwrite existing files,
- # type "sh file -c".
- # The tool that generated this appeared in the comp.sources.unix newsgroup;
- # send mail to comp-sources-unix@uunet.uu.net if you want that tool.
- # Contents: README Makefile.dist adjuncts.icn findre.icn getkeys.icn
- # gettext.icn idxtext.icn jarg2get.icn jargon.src
- # Wrapped by kent@sparky on Wed Jun 19 22:17:50 1991
- PATH=/bin:/usr/bin:/usr/ucb ; export PATH
- echo If this archive is complete, you will see the following message:
- echo ' "shar: End of archive 1 (of 1)."'
- if test -f 'README' -a "${1}" != "-c" ; then
- echo shar: Will not clobber existing file \"'README'\"
- else
- echo shar: Extracting \"'README'\" \(4444 characters\)
- sed "s/^X//" >'README' <<'END_OF_FILE'
- X-------
- X
- XProgram name: jargon
- XSource language: icon
- XPurpose: quickly find entries in the hackers' jargon file
- X
- X-------
- X
- XDescription:
- X
- XThis shell archive contains a jargon database program, aptly enough
- Xnamed "jargon." If you have version 2.7.1 (posted March 1, 1991 to
- Xalt.sources) or version 2.8.[1-3] (ftp from pit-manager.mit.edu,
- Xpub/jargon/jargon2.8.3.Z) of the hackers' jargon file, you can use
- Xthis package for quick access to entries within that file. Just type
- X"jargon word" (where "word" is the bit of hackers' jargon you want a
- Xdefinition for). If you're not sure what's in the database, type
- X"jargon -p pattern" (where pattern is an egrep-style regular
- Xexpression). Jargon will print a list of all entries containing
- Xpattern to the standard output. Uppercase letters are folded into
- Xtheir lowercase equivalents, incidentally, in order to provide enough
- Xlatitude for all the different capitalization practices.
- X
- X-------
- X
- XInstallation:
- X
- XCp Makefile.dist to Makefile and edit it to reflect local file
- Xstructure and ownership conventions. After editing, make all. If,
- Xafter seeing a lot of garbage go by, you get the message "everything
- Xseems OK," then su root and make install. If you don't have root
- Xprivileges, you must change the DESTDIR and LIBDIR variables in the
- Xmakefile so that they reflect directories you have access to.
- X
- XIf you have an Icon implementation that doesn't support expandable
- Xregions, then you may need to adjust QLSIZE and HEAPSIZE (on which,
- Xsee the icont man page).
- X
- XIf you are operating on a non-Unix platform, you'll need to edit the
- Xfile "jargon.src" so that the variable "database" is set to the full
- Xpath of the jargon.wrd file (i.e. $(LIBDIR)/jargon.wrd in the
- Xmakefile). Then type:
- X
- X icont jarg2get.icn
- X icont -o idxtext idxtext.icn adjuncts.icn
- X copy jargon.src jargon.icn
- X icont -o jargon jargon.icn gettext.icn adjuncts.icn \
- X getkeys.icn findre.icn
- X (iconx) jarg2get < JARGONFILE > jargon.wrd
- X (iconx) idxtext jargon.wrd
- X
- Xwhere "copy" is your system's file copy or rename command, and
- XJARGONFILE is the name of the original jargon.ascii file. Note that
- Xthe backslash above merely indicates that the next line should be
- Xtyped together with the current line. It is not to be entered
- Xliterally. On some systems, it may be necessary to type "iconx"
- Xin as indicated above in parentheses (in some cases, "ficonx").
- X
- XTo test a non-UNIX installation, type
- X
- X (iconx) jargon zork
- X
- XAgain, "iconx" may not be necessary. If you get a definition for
- X"zork," then everything is probably OK. If you altered jargon.src so
- Xthat the database variable points to a jargon.wrd file in the current
- Xdirectory, then there is no need to do anything more. If you named a
- Xdatabase file in a different directory than the current one, then you
- Xshould copy the jargon.wrd file to that directory, cd to that direc-
- Xtory, then type
- X
- X (iconx) idxtext DATABASE
- X
- Xwhere DATABASE gives the full path of your jargon.wrd file. Running
- Xthis command may take a minute, so be patient.
- X
- XMS-DOS users: Your small address space and segmented architecture
- Xmake it rough to install jargon. If you are determined, then build
- Xthe files somewhere else. Rename the .IDX file that is created dur-
- Xing this process as "argonwrd.idx." Download this and the remaining
- Xfiles to your PC. All you need to do on the PC is edit jargon.src
- Xto reflect where you want your jargon.wrd file to go, copy jargon.src
- Xto jargon.icn, then type
- X
- X icont -o jargon jargon.icn gettext.icn adjuncts.icn \
- X getkeys.icn findre.icn
- X
- XI'd expect the same procedure to work for other micros, though I've
- Xonly tried it for PCs running MS-DOS.
- X
- X-------
- X
- XAdmission:
- X
- XJargon is really just a cheap trick I use to get people to test a
- Xsmall text retrieval package. This package is fully documented in the
- Xfiles
- X
- X gettext.icn
- X getkeys.icn
- X adjuncts.icn
- X idxtext.icn
- X
- XIcon programmers will find gettext a nice, easy way to access
- Xkey/value combinations from a file (rather than from a hash table in
- Xmemory). Often the practicalities of memory usage lead one to avoid
- Xhuge in-core hash tables. The gettext package provides a small-scale,
- Xbut fairly efficient, alternative.
- X
- X-------
- X
- XProblems:
- X
- XThis program works fine on a Xenix/386 box. Although I've heard many
- Xreports of it working elsewhere, your mileage may naturally vary. If
- Xthere are problems, I'd like to hear about them. Send mail to:
- X
- XRichard Goerwitz (goer@sophist.uchicago.edu)
- X
- END_OF_FILE
- if test 4444 -ne `wc -c <'README'`; then
- echo shar: \"'README'\" unpacked with wrong size!
- fi
- # end of 'README'
- fi
- if test -f 'Makefile.dist' -a "${1}" != "-c" ; then
- echo shar: Will not clobber existing file \"'Makefile.dist'\"
- else
- echo shar: Extracting \"'Makefile.dist'\" \(2074 characters\)
- sed "s/^X//" >'Makefile.dist' <<'END_OF_FILE'
- X# Don't change this unless you're absolutely sure of what you're doing.
- XPROGNAME = jargon
- X
- X# You may need to change these.
- XICONC = /usr/icon/v8/bin/icont
- XJARGONFILE = ./jargon2.8.3
- X
- X# Please edit these to reflect your local file structure & conventions.
- X# If you are running on a non-UNIX installation, see below at SRC1.
- XDESTDIR = /usr/local/bin
- XLIBDIR = /usr/local/lib/$(PROGNAME)
- XOWNER = root # bin
- XGROUP = root # bin
- X
- X# I hope you won't have to use this.
- XDEBUGFLAG = #-t
- X
- XSHELL = /bin/sh
- X# Source files for $(PROGNAME). Uncomment NONUNIX for non-Unix
- X# installations.
- X# NONUNIX = getkeys.icn findre.icn
- XSRC1 = $(PROGNAME).icn gettext.icn adjuncts.icn $(NONUNIX)
- X
- Xall: jargon.wrd $(PROGNAME)
- X @echo "\nEverything seems OK. Go ahead & install.\n"
- X
- Xjargon.wrd: jarg2get idxtext
- X test -f $(JARGONFILE)
- X ./jarg2get < $(JARGONFILE) > jargon.wrd
- X @echo "\nThis may take a few minutes:\n"
- X ./idxtext jargon.wrd
- X
- X$(PROGNAME): $(SRC1)
- X $(ICONC) $(DEBUGFLAG) -o $@ $(SRC1)
- X
- Xidxtext: idxtext.icn adjuncts.icn
- X $(ICONC) $(DEBUGFLAG) -o $@ idxtext.icn adjuncts.icn
- X
- Xjarg2get: jarg2get.icn
- X $(ICONC) $(DEBUGFLAG) -o $@ $?
- X
- X
- X# Set pathnames to their correct values for this system.
- X$(PROGNAME).icn:
- X sed "s|/usr/local/lib/$(PROGNAME)/jargon.wrd|$(LIBDIR)/jargon.wrd|g" < jargon.src > jargon.icn
- X
- X# Pessimistic assumptions regarding the environment (in particular,
- X# I don't assume you have the BSD "install" shell script).
- Xinstall: all
- X test -d $(DESTDIR) || (mkdir $(DESTDIR) && chmod 755 $(DESTDIR))
- X cp $(PROGNAME) $(DESTDIR)/
- X chgrp $(GROUP) $(DESTDIR)/$(PROGNAME)
- X chown $(OWNER) $(DESTDIR)/$(PROGNAME)
- X test -d $(LIBDIR) || (mkdir $(LIBDIR) && chmod 755 $(LIBDIR))
- X cp jargon.wrd $(LIBDIR)/
- X chgrp $(GROUP) $(LIBDIR)/jargon.wrd
- X chown $(OWNER) $(LIBDIR)/jargon.wrd
- X @echo "\nThis may take a few minutes:\n"
- X ./idxtext $(LIBDIR)/jargon.wrd
- X chgrp $(GROUP) $(LIBDIR)/*IDX
- X chown $(OWNER) $(LIBDIR)/*IDX
- X ./$(PROGNAME) zork
- X @echo "\nEverything checks out OK. Installation done.\n"
- X
- Xclean:
- X -rm -f core *~ *.u? *.IDX $(PROGNAME)
- X
- Xclobber: clean
- X -rm -f $(PROGNAME).icn jargon.wrd
- X
- END_OF_FILE
- if test 2074 -ne `wc -c <'Makefile.dist'`; then
- echo shar: \"'Makefile.dist'\" unpacked with wrong size!
- fi
- # end of 'Makefile.dist'
- fi
- if test -f 'adjuncts.icn' -a "${1}" != "-c" ; then
- echo shar: Will not clobber existing file \"'adjuncts.icn'\"
- else
- echo shar: Extracting \"'adjuncts.icn'\" \(1335 characters\)
- sed "s/^X//" >'adjuncts.icn' <<'END_OF_FILE'
- X############################################################################
- X#
- X# Name: adjuncts.icn
- X#
- X# Title: adjuncts (adjunct utilities for gettext and idxtext)
- X#
- X# Author: Richard L. Goerwitz
- X#
- X# Version: 1.2
- X#
- X############################################################################
- X#
- X# Pretty mundane stuff. Basename(), Pathname(), Strip(), and a utility
- X# for creating index filenames.
- X#
- X############################################################################
- X#
- X# Links: none
- X#
- X# See also: gettext.icn, idxtext.icn
- X#
- X############################################################################
- X
- X
- Xglobal _slash, _baselen
- X
- Xprocedure Basename(s)
- X
- X # global _slash
- X s ? {
- X while tab(find(_slash)+1)
- X return tab(0)
- X }
- X
- Xend
- X
- X
- Xprocedure Pathname(s)
- X
- X # global _slash
- X s2 := ""
- X s ? {
- X while s2 ||:= tab(find(_slash)+1)
- X return s2
- X }
- X
- Xend
- X
- X
- Xprocedure getidxname(FNAME)
- X
- X #
- X # Discard path component. Cut basename down to a small enough
- X # size that the OS will be able to handle addition of the ex-
- X # tension ".IDX"
- X #
- X
- X # global _slash, _baselen
- X return right(Strip(Basename(FNAME,_slash),'.'), _baselen, "x") || ".IDX"
- X
- Xend
- X
- X
- Xprocedure Strip(s,c)
- X
- X local s2
- X
- X s2 := ""
- X s ? {
- X while s2 ||:= tab(upto(c))
- X do tab(many(c))
- X s2 ||:= tab(0)
- X }
- X return s2
- X
- Xend
- END_OF_FILE
- if test 1335 -ne `wc -c <'adjuncts.icn'`; then
- echo shar: \"'adjuncts.icn'\" unpacked with wrong size!
- fi
- # end of 'adjuncts.icn'
- fi
- if test -f 'findre.icn' -a "${1}" != "-c" ; then
- echo shar: Will not clobber existing file \"'findre.icn'\"
- else
- echo shar: Extracting \"'findre.icn'\" \(20702 characters\)
- sed "s/^X//" >'findre.icn' <<'END_OF_FILE'
- X########################################################################
- X#
- X# Name: findre.icn
- X#
- X# Title: "Find" Regular Expression
- X#
- X# Author: Richard L. Goerwitz
- X#
- X# Version: 1.14
- X#
- X########################################################################
- X#
- X# I place this and any later versions in the public domain - RLG.
- X#
- X########################################################################
- X#
- X# DESCRIPTION: findre() is like the Icon builtin function find(),
- X# except that it takes, as its first argument, a regular expression
- X# pretty much like the ones the Unix egrep command uses (the few
- X# minor differences are listed below). Its syntax is the same as
- X# find's (i.e. findre(s1,s2,i,j)), with the exception that a no-
- X# argument invocation wipes out all static structures utilized by
- X# findre, and then forces a garbage collection.
- X#
- X# (For those not familiar with regular expressions and the Unix egrep
- X# command: findre() offers a simple and compact wildcard-based search
- X# system. If you do a lot of searches through text files, or write
- X# programs which do searches based on user input, then findre is a
- X# utility you might want to look over.)
- X#
- X# IMPORTANT DIFFERENCES between find and findre: As noted above,
- X# findre() is just a find() function that takes a regular expression
- X# as its first argument. One major problem with this setup is that
- X# it leaves the user with no easy way to tab past a matched
- X# substring, as with
- X#
- X# s ? write(tab(find("hello")+5))
- X#
- X# In order to remedy this intrinsic deficiency, findre() sets the
- X# global variable __endpoint to the first position after any given
- X# match occurs. Use this variable with great care, preferably
- X# assigning its value to some other variable immediately after the
- X# match (for example, findre("hello [.?!]*",s) & tmp := __endpoint).
- X# Otherwise, you will certainly run into trouble. (See the example
- X# below for an illustration of how __endpoint is used).
- X#
- X# IMPORTANT DIFFERENCES between egrep and findre: findre utilizes
- X# the same basic language as egrep. The only big difference is that
- X# findre uses intrinsic Icon data structures and escaping conven-
- X# tions rather than those of any particular Unix variant. Be care-
- X# ful! If you put findre("\(hello\)",s) into your source file,
- X# findre will treat it just like findre("(hello)",s). If, however,
- X# you enter '\(hello\)' at run-time (via, say, findre(!&input,s)),
- X# what Icon receives will depend on your operating system (most
- X# likely, a trace will show "\\(hello\\)").
- X#
- X# BUGS: Space has essentially been conserved at the expense of time
- X# in the automata produced by findre(). The algorithm, in other
- X# words, will produce the equivalent of a pushdown automaton under
- X# certain circumstances, rather than strive (at the expense of space)
- X# for full determinism. I tried to make up a nfa -> dfa converter
- X# that would only create that portion of the dfa it needed to accept
- X# or reject a string, but the resulting automaton was actually quite
- X# slow (if anyone can think of a way to do this in Icon, and keep it
- X# small and fast, please let us all know about it). Note that under
- X# version 8 of Icon, findre takes up negligible storage space, due to
- X# the much improved hashing algorithm. I have not tested it under
- X# version 7, but I would expect it to use up quite a bit more space
- X# in that environment.
- X#
- X# IMPORTANT NOTE: Findre takes a shortest-possible-match approach
- X# to regular expressions. In other words, if you look for "a*",
- X# findre will not even bother looking for an "a." It will just match
- X# the empty string. Without this feature, findre would perform a bit
- X# more slowly. The problem with such an approach is that often the
- X# user will want to tab past the longest possible string of matched
- X# characters (say tab((findre("a*|b*"), __endpoint))). In circumstan-
- X# ces like this, please just use something like:
- X#
- X# s ? {
- X# tab(find("a")) & # or use Arb() from the IPL (patterns.icn)
- X# tab(many('a'))
- X# tab(many('b'))
- X# }
- X#
- X# or else use some combination of findre and the above.
- X#
- X########################################################################
- X#
- X# REGULAR EXPRESSION SYNTAX: Regular expression syntax is complex,
- X# and yet simple. It is simple in the sense that most of its power
- X# is concentrated in about a dozen easy-to-learn symbols. It is
- X# complex in the sense that, by combining these symbols with
- X# characters, you can represent very intricate patterns.
- X#
- X# I make no pretense here of offering a full explanation of regular
- X# expressions, their usage, and the deeper nuances of their syntax.
- X# As noted above, this should be gleaned from a Unix manual. For
- X# quick reference, however, I have included a brief summary of all
- X# the special symbols used, accompanied by an explanation of what
- X# they mean, and, in some cases, of how they are used (most of this
- X# is taken from the comments prepended to Jerry Nowlin's Icon-grep
- X# command, as posted a couple of years ago):
- X#
- X# ^ - matches if the following pattern is at the beginning
- X# of a line (i.e. ^# matches lines beginning with "#")
- X# $ - matches if the preceding pattern is at the end of a line
- X# . - matches any single character
- X# + - matches from 1 to any number of occurrences of the
- X# previous expression (i.e. a character, or set of paren-
- X# thesized/bracketed characters)
- X# * - matches from 0 to any number of occurrences of the previous
- X# expression
- X# \ - removes the special meaning of any special characters
- X# recognized by this program (i.e. if you want to match lines
- X# beginning with a "[", write ^\[, and not ^[)
- X# | - matches either the pattern before it, or the one after
- X# it (i.e. abc|cde matches either abc or cde)
- X# [] - matches any member of the enclosed character set, or,
- X# if ^ is the first character, any nonmember of the
- X# enclosed character set (i.e. [^ab] matches any character
- X# _except_ a and b).
- X# () - used for grouping (e.g. ^(abc|cde)$ matches lines consist-
- X# ing of either "abc" or "cde," while ^abc|cde$ matches
- X# lines either beginning with "abc" or ending in "cde")
- X#
- X#########################################################################
- X#
- X# EXAMPLE program:
- X#
- X# procedure main(a)
- X# while line := !&input do {
- X# token_list := tokenize_line(line,a[1])
- X# every write(!token_list)
- X# }
- X# end
- X#
- X# procedure tokenize_line(s,sep)
- X# tmp_lst := []
- X# s ? {
- X# while field := tab(findre(sep)|0) &
- X# mark := __endpoint
- X# do {
- X# put(tmp_lst,"" ~== field)
- X# if pos(0) then break
- X# else tab(mark)
- X# }
- X# }
- X# return tmp_lst
- X# end
- X#
- X# The above program would be compiled with findre (e.g. "icont
- X# test_prg.icn findre.icn") to produce a single executable which
- X# tokenizes each line of input based on a user-specified delimiter.
- X# Note how __endpoint is set soon after findre() succeeds. Note
- X# also how empty fields are excluded with "" ~==, etc. Finally, note
- X# that the temporary list, tmp_lst, is not needed. It is included
- X# here merely to illustrate one way in which tokens might be stored.
- X#
- X# Tokenizing is, of course, only one of many uses one might put
- X# findre to. It is very helpful in allowing the user to construct
- X# automata at run-time. If, say, you want to write a program that
- X# searches text files for patterns given by the user, findre would be
- X# a perfect utility to use. Findre in general permits more compact
- X# expression of patterns than one can obtain using intrinsic Icon
- X# scanning facilities. Its near complete compatibility with the Unix
- X# regexp library, moreover, makes for greater ease of porting,
- X# especially in cases where Icon is being used to prototype C code.
- X#
- X#########################################################################
- X
- X
- Xglobal state_table, parends_present, slash_present
- Xglobal biggest_nonmeta_str, __endpoint
- Xrecord o_a_s(op,arg,state)
- X
- X
- Xprocedure findre(re, s, i, j)
- X
- X local p, x, nonmeta_len
- X static FSTN_table, STRING_table
- X initial {
- X FSTN_table := table()
- X STRING_table := table()
- X }
- X
- X if /re then {
- X FSTN_table := table()
- X STRING_table := table()
- X collect() # do it *now*
- X return
- X }
- X
- X /s := &subject
- X if \i then {
- X if i < 1 then
- X i := *s + (i+1)
- X }
- X else i := \&pos | 1
- X if \j then {
- X if j < 1 then
- X j := *s + (j+1)
- X }
- X
- X else j := *s+1
- X if /FSTN_table[re] then {
- X # If we haven't seen this re before, then...
- X if \STRING_table[re] then {
- X # ...if it's in the STRING_table, use plain find()
- X every p := find(STRING_table[re],s,i,j)
- X do { __endpoint := p + *STRING_table[re]; suspend p }
- X fail
- X }
- X else {
- X # However, if it's not in the string table, we have to
- X # tokenize it and check for metacharacters. If it has
- X # metas, we create an FSTN, and put that into FSTN_table;
- X # otherwise, we just put it into the STRING_table.
- X tokenized_re := tokenize(re)
- X if 0 > !tokenized_re then {
- X # if at least one element is < 0, re has metas
- X MakeFSTN(tokenized_re) | err_out(re,2)
- X # both biggest_nonmeta_str and state_table are global
- X /FSTN_table[re] := [.biggest_nonmeta_str, copy(state_table)]
- X }
- X else {
- X # re has no metas; put the input string into STRING_table
- X # for future reference, and execute find() at once
- X tmp := ""; every tmp ||:= char(!tokenized_re)
- X insert(STRING_table,re,tmp)
- X every p := find(STRING_table[re],s,i,j)
- X do { __endpoint := p + *STRING_table[re]; suspend p }
- X fail
- X }
- X }
- X }
- X
- X
- X if nonmeta_len := (1 < *FSTN_table[re][1]) then {
- X # If the biggest non-meta string in the original re
- X # was more than 1, then put in a check for it...
- X s[1:j] ? {
- X tab(x := i to j - nonmeta_len) &
- X (find(FSTN_table[re][1]) | fail) \ 1 &
- X (__endpoint := apply_FSTN(&null,FSTN_table[re][2])) &
- X (suspend x)
- X }
- X }
- X else {
- X #...otherwise it's not worth worrying about the biggest nonmeta str
- X s[1:j] ? {
- X tab(x := i to j) &
- X (__endpoint := apply_FSTN(&null,FSTN_table[re][2])) &
- X (suspend x)
- X }
- X }
- X
- Xend
- X
- X
- X
- Xprocedure apply_FSTN(ini,tbl)
- X
- X static s_tbl
- X local POS, tmp, fin
- X
- X /ini := 1 & s_tbl := tbl & biggest_pos := 1
- X if ini = 0 then {
- X return &pos
- X }
- X POS := &pos
- X fin := 0
- X
- X repeat {
- X if tmp := !s_tbl[ini] &
- X tab(tmp.op(tmp.arg))
- X then {
- X if tmp.state = fin
- X then return &pos
- X else ini := tmp.state
- X }
- X else (&pos := POS, fail)
- X }
- X
- Xend
- X
- X
- X
- Xprocedure tokenize(s)
- X
- X local chr, tmp
- X
- X token_list := list()
- X s ? {
- X tab(many('*+?|'))
- X while chr := move(1) do {
- X if chr == "\\"
- X # it can't be a metacharacter; remove the \ and "put"
- X # the integer value of the next chr into token_list
- X then put(token_list,ord(move(1))) | err_out(s,2,chr)
- X else if any('*+()|?.$^',chr)
- X then {
- X # Yuck! Egrep compatibility stuff.
- X case chr of {
- X "*" : {
- X tab(many('*+?'))
- X put(token_list,-ord("*"))
- X }
- X "+" : {
- X tmp := tab(many('*?+')) | &null
- X if upto('*?',\tmp)
- X then put(token_list,-ord("*"))
- X else put(token_list,-ord("+"))
- X }
- X "?" : {
- X tmp := tab(many('*?+')) | &null
- X if upto('*+',\tmp)
- X then put(token_list,-ord("*"))
- X else put(token_list,-ord("?"))
- X }
- X "(" : {
- X tab(many('*+?'))
- X put(token_list,-ord("("))
- X }
- X default: {
- X put(token_list,-ord(chr))
- X }
- X }
- X }
- X else {
- X case chr of {
- X # More egrep compatibility stuff.
- X "[" : {
- X b_loc := find("[") | *&subject+1
- X every next_one := find("]",,,b_loc)
- X \next_one ~= &pos | err_out(s,2,chr)
- X put(token_list,-ord(chr))
- X }
- X "]" : {
- X if &pos = (\next_one+1)
- X then put(token_list,-ord(chr)) &
- X next_one := &null
- X else put(token_list,ord(chr))
- X }
- X default: put(token_list,ord(chr))
- X }
- X }
- X }
- X }
- X
- X token_list := UnMetaBrackets(token_list)
- X
- X fixed_length_token_list := list(*token_list)
- X every i := 1 to *token_list
- X do fixed_length_token_list[i] := token_list[i]
- X return fixed_length_token_list
- X
- Xend
- X
- X
- X
- Xprocedure UnMetaBrackets(l)
- X
- X # Since brackets delineate a cset, it doesn't make
- X # any sense to have metacharacters inside of them.
- X # UnMetaBrackets makes sure there are no metacharac-
- X # ters inside of the brackets.
- X
- X local tmplst, i, Lb, Rb
- X
- X tmplst := list(); i := 0
- X Lb := -ord("[")
- X Rb := -ord("]")
- X
- X while (i +:= 1) <= *l do {
- X if l[i] = Lb then {
- X put(tmplst,l[i])
- X until l[i +:= 1] = Rb
- X do put(tmplst,abs(l[i]))
- X put(tmplst,l[i])
- X }
- X else put(tmplst,l[i])
- X }
- X return tmplst
- X
- Xend
- X
- X
- X
- Xprocedure MakeFSTN(l,INI,FIN)
- X
- X # MakeFSTN recursively descends through the tree structure
- X # implied by the tokenized string, l, recording in (global)
- X # state_table a list of operations to be performed, and the
- X # initial and final states which apply to them.
- X
- X # global biggest_nonmeta_str, slash_present, parends_present
- X static Lp, Rp, Sl, Lb, Rb, Caret_inside, Dot, Dollar, Caret_outside
- X local i, inter, inter2, tmp
- X initial {
- X Lp := -ord("("); Rp := -ord(")")
- X Sl := -ord("|")
- X Lb := -ord("["); Rb := -ord("]"); Caret_inside := ord("^")
- X Dot := -ord("."); Dollar := -ord("$"); Caret_outside := -ord("^")
- X }
- X
- X /INI := 1 & state_table := table() &
- X NextState("new") & biggest_nonmeta_str := ""
- X /FIN := 0
- X
- X # I haven't bothered to test for empty lists everywhere.
- X if *l = 0 then {
- X /state_table[INI] := []
- X put(state_table[INI],o_a_s(zSucceed,&null,FIN))
- X return
- X }
- X
- X # HUNT DOWN THE SLASH (ALTERNATION OPERATOR)
- X every i := 1 to *l do {
- X if l[i] = Sl & tab_bal(l,Lp,Rp) = i then {
- X if i = 1 then err_out(l,2,char(abs(l[i]))) else {
- X /slash_present := "yes"
- X inter := NextState()
- X inter2:= NextState()
- X MakeFSTN(l[1:i],inter2,FIN)
- X MakeFSTN(l[i+1:0],inter,FIN)
- X /state_table[INI] := []
- X put(state_table[INI],o_a_s(apply_FSTN,inter2,0))
- X put(state_table[INI],o_a_s(apply_FSTN,inter,0))
- X return
- X }
- X }
- X }
- X
- X # HUNT DOWN PARENTHESES
- X if l[1] = Lp then {
- X i := tab_bal(l,Lp,Rp) | err_out(l,2,"(")
- X inter := NextState()
- X if any('*+?',char(abs(0 > l[i+1]))) then {
- X case l[i+1] of {
- X -ord("*") : {
- X /state_table[INI] := []
- X put(state_table[INI],o_a_s(apply_FSTN,inter,0))
- X MakeFSTN(l[2:i],INI,INI)
- X MakeFSTN(l[i+2:0],inter,FIN)
- X return
- X }
- X -ord("+") : {
- X inter2 := NextState()
- X /state_table[inter2] := []
- X MakeFSTN(l[2:i],INI,inter2)
- X put(state_table[inter2],o_a_s(apply_FSTN,inter,0))
- X MakeFSTN(l[2:i],inter2,inter2)
- X MakeFSTN(l[i+2:0],inter,FIN)
- X return
- X }
- X -ord("?") : {
- X /state_table[INI] := []
- X put(state_table[INI],o_a_s(apply_FSTN,inter,0))
- X MakeFSTN(l[2:i],INI,inter)
- X MakeFSTN(l[i+2:0],inter,FIN)
- X return
- X }
- X }
- X }
- X else {
- X MakeFSTN(l[2:i],INI,inter)
- X MakeFSTN(l[i+1:0],inter,FIN)
- X return
- X }
- X }
- X else { # I.E. l[1] NOT = Lp (left parenthesis as -ord("("))
- X every i := 1 to *l do {
- X case l[i] of {
- X Lp : {
- X inter := NextState()
- X MakeFSTN(l[1:i],INI,inter)
- X /parends_present := "yes"
- X MakeFSTN(l[i:0],inter,FIN)
- X return
- X }
- X Rp : err_out(l,2,")")
- X }
- X }
- X }
- X
- X # NOW, HUNT DOWN BRACKETS
- X if l[1] = Lb then {
- X i := tab_bal(l,Lb,Rb) | err_out(l,2,"[")
- X inter := NextState()
- X tmp := ""; every tmp ||:= char(l[2 to i-1])
- X if Caret_inside = l[2]
- X then tmp := ~cset(Expand(tmp[2:0]))
- X else tmp := cset(Expand(tmp))
- X if any('*+?',char(abs(0 > l[i+1]))) then {
- X case l[i+1] of {
- X -ord("*") : {
- X /state_table[INI] := []
- X put(state_table[INI],o_a_s(apply_FSTN,inter,0))
- X put(state_table[INI],o_a_s(any,tmp,INI))
- X MakeFSTN(l[i+2:0],inter,FIN)
- X return
- X }
- X -ord("+") : {
- X inter2 := NextState()
- X /state_table[INI] := []
- X put(state_table[INI],o_a_s(any,tmp,inter2))
- X /state_table[inter2] := []
- X put(state_table[inter2],o_a_s(apply_FSTN,inter,0))
- X put(state_table[inter2],o_a_s(any,tmp,inter2))
- X MakeFSTN(l[i+2:0],inter,FIN)
- X return
- X }
- X -ord("?") : {
- X /state_table[INI] := []
- X put(state_table[INI],o_a_s(apply_FSTN,inter,0))
- X put(state_table[INI],o_a_s(any,tmp,inter))
- X MakeFSTN(l[i+2:0],inter,FIN)
- X return
- X }
- X }
- X }
- X else {
- X /state_table[INI] := []
- X put(state_table[INI],o_a_s(any,tmp,inter))
- X MakeFSTN(l[i+1:0],inter,FIN)
- X return
- X }
- X }
- X else { # I.E. l[1] not = Lb
- X every i := 1 to *l do {
- X case l[i] of {
- X Lb : {
- X inter := NextState()
- X MakeFSTN(l[1:i],INI,inter)
- X MakeFSTN(l[i:0],inter,FIN)
- X return
- X }
- X Rb : err_out(l,2,"]")
- X }
- X }
- X }
- X
- X # FIND INITIAL SEQUENCES OF POSITIVE INTEGERS, CONCATENATE THEM
- X if i := match_positive_ints(l) then {
- X inter := NextState()
- X tmp := Ints2String(l[1:i])
- X # if a slash has been encountered already, forget optimizing
- X # in this way; if parends are present, too, then forget it,
- X # unless we are at the beginning or end of the input string
- X if INI = 1 | FIN = 2 | /parends_present &
- X /slash_present & *tmp > *biggest_nonmeta_str
- X then biggest_nonmeta_str := tmp
- X /state_table[INI] := []
- X put(state_table[INI],o_a_s(match,tmp,inter))
- X MakeFSTN(l[i:0],inter,FIN)
- X return
- X }
- X
- X # OKAY, CLEAN UP ALL THE JUNK THAT'S LEFT
- X i := 0
- X while (i +:= 1) <= *l do {
- X case l[i] of {
- X Dot : { Op := any; Arg := &cset }
- X Dollar : { Op := pos; Arg := 0 }
- X Caret_outside: { Op := pos; Arg := 1 }
- X default : { Op := match; Arg := char(0 < l[i]) }
- X } | err_out(l,2,char(abs(l[i])))
- X inter := NextState()
- X if any('*+?',char(abs(0 > l[i+1]))) then {
- X case l[i+1] of {
- X -ord("*") : {
- X /state_table[INI] := []
- X put(state_table[INI],o_a_s(apply_FSTN,inter,0))
- X put(state_table[INI],o_a_s(Op,Arg,INI))
- X MakeFSTN(l[i+2:0],inter,FIN)
- X return
- X }
- X -ord("+") : {
- X inter2 := NextState()
- X /state_table[INI] := []
- X put(state_table[INI],o_a_s(Op,Arg,inter2))
- X /state_table[inter2] := []
- X put(state_table[inter2],o_a_s(apply_FSTN,inter,0))
- X put(state_table[inter2],o_a_s(Op,Arg,inter2))
- X MakeFSTN(l[i+2:0],inter,FIN)
- X return
- X }
- X -ord("?") : {
- X /state_table[INI] := []
- X put(state_table[INI],o_a_s(apply_FSTN,inter,0))
- X put(state_table[INI],o_a_s(Op,Arg,inter))
- X MakeFSTN(l[i+2:0],inter,FIN)
- X return
- X }
- X }
- X }
- X else {
- X /state_table[INI] := []
- X put(state_table[INI],o_a_s(Op,Arg,inter))
- X MakeFSTN(l[i+1:0],inter,FIN)
- X return
- X }
- X }
- X
- X # WE SHOULD NOW BE DONE INSERTING EVERYTHING INTO state_table
- X # IF WE GET TO HERE, WE'VE PARSED INCORRECTLY!
- X err_out(l,4)
- X
- Xend
- X
- X
- X
- Xprocedure NextState(new)
- X static nextstate
- X if \new then nextstate := 1
- X else nextstate +:= 1
- X return nextstate
- Xend
- X
- X
- X
- Xprocedure err_out(x,i,elem)
- X writes(&errout,"Error number ",i," parsing ",image(x)," at ")
- X if \elem
- X then write(&errout,image(elem),".")
- X else write(&errout,"(?).")
- X exit(i)
- Xend
- X
- X
- X
- Xprocedure zSucceed()
- X return .&pos
- Xend
- X
- X
- X
- Xprocedure Expand(s)
- X
- X s2 := ""
- X s ? {
- X s2 ||:= ="^"
- X s2 ||:= ="-"
- X while s2 ||:= tab(find("-")-1) do {
- X if (c1 := move(1), ="-",
- X c2 := move(1),
- X c1 << c2)
- X then every s2 ||:= char(ord(c1) to ord(c2))
- X else s2 ||:= 1(move(2), not(pos(0))) | err_out(s,2,"-")
- X }
- X s2 ||:= tab(0)
- X }
- X return s2
- X
- Xend
- X
- X
- X
- Xprocedure tab_bal(l,i1,i2)
- X i := 0
- X i1_count := 0; i2_count := 0
- X while (i +:= 1) <= *l do {
- X case l[i] of {
- X i1 : i1_count +:= 1
- X i2 : i2_count +:= 1
- X }
- X if i1_count = i2_count
- X then suspend i
- X }
- Xend
- X
- X
- Xprocedure match_positive_ints(l)
- X
- X # Matches the longest sequence of positive integers in l,
- X # beginning at l[1], which neither contains, nor is fol-
- X # lowed by a negative integer. Returns the first position
- X # after the match. Hence, given [55, 55, 55, -42, 55],
- X # match_positive_ints will return 3. [55, -42] will cause
- X # it to fail rather than return 1 (NOTE WELL!).
- X
- X every i := 1 to *l do {
- X if l[i] < 0
- X then return (3 < i) - 1 | fail
- X }
- X return *l + 1
- X
- Xend
- X
- X
- Xprocedure Ints2String(l)
- X tmp := ""
- X every tmp ||:= char(!l)
- X return tmp
- Xend
- X
- X
- Xprocedure StripChar(s,s2)
- X if find(s2,s) then {
- X tmp := ""
- X s ? {
- X while tmp ||:= tab(find(s2))
- X do tab(many(cset(s2)))
- X tmp ||:= tab(0)
- X }
- X }
- X return \tmp | s
- Xend
- END_OF_FILE
- if test 20702 -ne `wc -c <'findre.icn'`; then
- echo shar: \"'findre.icn'\" unpacked with wrong size!
- fi
- # end of 'findre.icn'
- fi
- if test -f 'getkeys.icn' -a "${1}" != "-c" ; then
- echo shar: Will not clobber existing file \"'getkeys.icn'\"
- else
- echo shar: Extracting \"'getkeys.icn'\" \(1862 characters\)
- sed "s/^X//" >'getkeys.icn' <<'END_OF_FILE'
- X############################################################################
- X#
- X# Name: getkeys.icn
- X#
- X# Title: get keys for a gettext file
- X#
- X# Author: Richard L. Goerwitz
- X#
- X# Version: 1.1
- X#
- X############################################################################
- X#
- X# Getkeys(FNAME) generates all keys in FNAME in order of occurrence.
- X# See gettext.icn for a description of the requisite file structure
- X# for FNAME.
- X#
- X############################################################################
- X#
- X# Links: ./adjuncts.icn
- X# Requires: UNIX (maybe MS-DOS; untested)
- X# See also: gettext.icn
- X#
- X############################################################################
- X
- X
- X# declared in adjuncts.icn
- X# global _slash, _baselen
- X
- Xprocedure getkeys(FNAME)
- X
- X local line, intext, start_unindexed_part
- X initial {
- X if /_slash then {
- X if find("UNIX", &features) then {
- X _slash := "/"
- X _baselen := 10
- X }
- X else if find("MS-DOS", &features) then {
- X _slash := "\\"
- X _baselen := 8
- X }
- X else stop("getkeys: OS not supported")
- X }
- X }
- X
- X /FNAME & stop("error (getkeys): null argument")
- X
- X # Try to open index file (there may not be one).
- X if intext := open(Pathname(FNAME) || getidxname(FNAME)) then {
- X # If there's an index file, then just suspend all the keys in
- X # it (i.e. suspend every line except the first, up to the tab).
- X # The first line tells how many bytes in FNAME were indexed.
- X # Save it, and use it to seek to unindexed portions later on.
- X start_unindexed_part := integer(read(intext))
- X while line := read(intext) do
- X line ? suspend tab(find("\t")) \ 1
- X close(intext)
- X }
- X
- X intext := open(FNAME) | stop("getkeys: ",FNAME," not found")
- X seek(intext, \start_unindexed_part | 1)
- X while line := read(intext) do
- X line ? { suspend (="::", tab(0)) \ 1 }
- X
- X # Nothing left to suspend, so fail.
- X fail
- X
- Xend
- X
- END_OF_FILE
- if test 1862 -ne `wc -c <'getkeys.icn'`; then
- echo shar: \"'getkeys.icn'\" unpacked with wrong size!
- fi
- # end of 'getkeys.icn'
- fi
- if test -f 'gettext.icn' -a "${1}" != "-c" ; then
- echo shar: Will not clobber existing file \"'gettext.icn'\"
- else
- echo shar: Extracting \"'gettext.icn'\" \(6260 characters\)
- sed "s/^X//" >'gettext.icn' <<'END_OF_FILE'
- X############################################################################
- X#
- X# Name: gettext.icn
- X#
- X# Title: gettext (simple text-base routines)
- X#
- X# Author: Richard L. Goerwitz
- X#
- X# Version: 1.16
- X#
- X############################################################################
- X#
- X# Gettext() and associated routines allow the user to maintain a file
- X# of KEY/value combinations such that a call to gettext(KEY, FNAME)
- X# will produce value. Gettext() fails if no such KEY exists.
- X# Returns an empty string if the key exists, but has no associated
- X# value in the file, FNAME.
- X#
- X# The file format is simple. Keys belong on separate lines, marked
- X# as such by an initial colon+colon (::). Values begin on the line
- X# following their respective keys, and extend up to the next
- X# colon+colon-initial line or EOF. E.g.
- X#
- X# ::sample.1
- X# Notice how the key above, sample.1, has :: prepended to mark it
- X# out as a key. The text you are now reading represents that key's
- X# value. To retrieve this text, you would call gettext() with the
- X# name of the key passed as its first argument, and the name of the
- X# file in which this text is stored as its second argument (as in
- X# gettext("sample.1","tmp.idx")).
- X# ::next.key
- X# etc...
- X#
- X# For faster access, an indexing utility is included, idxtext. Idxtext
- X# creates a separate index for a given text-base file. If an index file
- X# exists in the same directory as FNAME, gettext() will make use of it.
- X# The index becomes worthwhile (at least on my system) after the text-
- X# base file becomes longer than 5 kilobytes.
- X#
- X# Don'ts:
- X# 1) Don't nest gettext text-base files.
- X# 2) Don't use spaces and/or tabs in key names.
- X# 3) Don't modify indexed files in any way other than to append
- X# additional keys/values (unless you want to re-index).
- X#
- X# This program is intended for situations where keys tend to have
- X# very large values, and use of an Icon table structure would be
- X# unwieldy.
- X#
- X# BUGS: Gettext() relies on the Icon runtime system and the OS to
- X# make sure the last text/index file it opens gets closed.
- X#
- X# Note: This program is NOT YET TESTED UNDER DOS. In particular,
- X# I have no idea whether the indexing mechanism will work, due to
- X# translation that has to be done on MS-DOS text files.
- X#
- X############################################################################
- X#
- X# Links: ./adjuncts.icn
- X#
- X# Requires: UNIX (maybe MS-DOS; untested)
- X#
- X############################################################################
- X
- X# declared in adjuncts.icn
- X# global _slash, _baselen
- X
- Xprocedure gettext(KEY,FNAME)
- X
- X local line, value
- X static last_FNAME, intext, inidx
- X initial {
- X if find("UNIX", &features) then {
- X _slash := "/"
- X _baselen := 10
- X }
- X else if find("MS-DOS", &features) then {
- X _slash := "\\"
- X _baselen := 8
- X }
- X else stop("gettext: OS not supported")
- X }
- X
- X (/KEY | /FNAME) & stop("error (gettext): null argument")
- X
- X if FNAME == \last_FNAME then {
- X seek(intext, 1)
- X seek(\inidx, 1)
- X }
- X else {
- X # We've got a new text-base file. Close the old one.
- X every close(\intext | \inidx)
- X # Try to open named text-base file.
- X intext := open(FNAME) | stop("gettext: ",FNAME," not found")
- X # Try to open index file.
- X inidx := open(Pathname(FNAME) || getidxname(FNAME)) | &null
- X }
- X last_FNAME := FNAME
- X
- X # Find offsets for key KEY in index file. If inidx (the index
- X # file) is null (which happens when none was found), get_offsets()
- X # defaults to 1. Otherwise it returns the offset for KEY in the
- X # index file, or, if KEY is not indexed, the last indexed byte of
- X # the text file. Returning the last indexed byte lets us seek to
- X # the end and do a sequential search of any key/value entries that
- X # have been added since the last time idxtext was run.
- X
- X seek(intext, get_offsets(KEY, inidx))
- X
- X # Find key. Should be right there, unless the user has appended
- X # key/value pairs to the end without re-indexing, or else has not
- X # bothered to index in the first place. In this case we're
- X # supposed to start a sequential search for KEY up to EOF.
- X
- X while line := (read(intext) | fail) do {
- X line ? {
- X if (="::", =KEY, pos(0))
- X then break
- X }
- X }
- X
- X # Collect all text up to the next colon+colon-initial line (::)
- X # or EOF.
- X value := ""
- X while line := read(intext) do {
- X match("::",line) & break
- X value ||:= line || "\n"
- X }
- X
- X # Note that a key with an empty value returns an empty string.
- X return trim(value, '\n')
- X
- Xend
- X
- X
- X
- Xprocedure get_offsets(KEY, inidx)
- X
- X local bottom, top, loc, firstpart, offset
- X # Use these to store values likely to be reused.
- X static old_inidx, firstline, SOF, EOF
- X
- X # If there's no index file, then just return an offset of 1.
- X if /inidx then
- X return 1
- X
- X # First line contains offset of last indexed byte in the main
- X # text file. We need this later. Save it. Start the binary
- X # search routine at the next byte after this line.
- X seek(inidx, 1)
- X if not (inidx === \old_inidx) then {
- X
- X # Get first line.
- X firstline := !inidx
- X # Set "bottom."
- X 1 = (SOF := where(inidx)-1) &
- X stop("get_offsets: corrupt .IDX file; reindex")
- X # How big is this file?
- X seek(inidx, 0)
- X EOF := where(inidx)
- X
- X old_inidx := inidx
- X }
- X # SOF, EOF constant for a given inidx file.
- X bottom := SOF; top := EOF
- X
- X # If bottom gets bigger than top, there's no such key.
- X until bottom > top do {
- X
- X loc := (top+bottom) / 2
- X seek(inidx, loc)
- X
- X # Move past next newline. If at EOF, break.
- X incr := 1
- X until reads(inidx) == "\n" do
- X incr +:= 1
- X if loc+incr = EOF then {
- X top := loc-1
- X next
- X }
- X
- X # Check to see if the current line contains KEY.
- X read(inidx) ? {
- X
- X # .IDX file line format is KEY\toffset
- X firstpart := tab(find("\t"))
- X if KEY == firstpart then {
- X # return offset
- X return (move(1), tab(0))
- X }
- X # Ah, this is what all binary searches do.
- X else {
- X if KEY << firstpart
- X then top := loc-1
- X else bottom := loc + incr + *&subject
- X }
- X }
- X }
- X
- X # First line of the index file contains offset of last indexed
- X # byte + 1. Might be the only line in the file (if it had no
- X # keys when it was indexed).
- X return firstline
- X
- Xend
- END_OF_FILE
- if test 6260 -ne `wc -c <'gettext.icn'`; then
- echo shar: \"'gettext.icn'\" unpacked with wrong size!
- fi
- # end of 'gettext.icn'
- fi
- if test -f 'idxtext.icn' -a "${1}" != "-c" ; then
- echo shar: Will not clobber existing file \"'idxtext.icn'\"
- else
- echo shar: Extracting \"'idxtext.icn'\" \(3452 characters\)
- sed "s/^X//" >'idxtext.icn' <<'END_OF_FILE'
- X############################################################################
- X#
- X# Name: idxtext.icn
- X#
- X# Title: idxtext (index text-base for gettext() routine)
- X#
- X# Author: Richard L. Goerwitz
- X#
- X# Version: 1.11
- X#
- X############################################################################
- X#
- X# Idxtext turns a file associated with the gettext() routine into
- X# an indexed text-base. Though gettext() will work fine with files
- X# that haven't been indexed via idxtext, access is faster with an
- X# index once the file is, say, over 10k (on my system the crossover
- X# point is actually about 5k).
- X#
- X# Usage is simply "idxtext [-a] file1 [file2 [...]]," where file1,
- X# file2, etc are the names of gettext-format files that are to be
- X# (re-)indexed. The -a flag tells idxtext to abort if an index file
- X# already exists.
- X#
- X# Index files have a very simple format: keyname tab offset
- X# [tab offset [etc.]]\n. The first line of the index file is a
- X# pointer to the last indexed byte of the text-base file it indexes.
- X#
- X# BUGS: Index files are too large. Also, I've yet to find a portable
- X# way of creating unique index names that are capable of being
- X# uniquely identified with their original text file. It might be
- X# sensible to hard code the name into the index. The chances of a
- X# conflict seem remote enough that I haven't bothered. If you're
- X# worried, use the -a flag.
- X#
- X############################################################################
- X#
- X# Links: ./adjuncts.icn
- X# Requires: UNIX or MS-DOS
- X# See also: gettext.icn
- X#
- X############################################################################
- X
- X
- X# declared in adjuncts.icn
- X# global _slash, _baselen
- X
- Xprocedure main(a)
- X
- X local ABORT, idxfile_name, fname, infile, outfile
- X initial {
- X if find("UNIX", &features) then {
- X _slash := "/"
- X _baselen := 10
- X }
- X else if find("MS-DOS", &features) then {
- X _slash := "\\"
- X _baselen := 8
- X }
- X else stop("idxtext: OS not supported")
- X }
- X
- X if \a[1] == "-a" then ABORT := pop(a)
- X
- X # Check to see if we have any arguments.
- X *a = 0 & stop("usage: idxtext [-a] file1 [file2 [...]]")
- X
- X # Start popping filenames off of the argument list.
- X while fname := pop(a) do {
- X
- X # Open input file.
- X infile := open(fname) |
- X { write(&errout, "idxtext: ",fname," not found"); next }
- X # Get index file name.
- X idxfile_name := Pathname(fname) || getidxname(fname)
- X if \ABORT then if close(open(idxfile_name)) then
- X stop("idxtext: index file ",idxfile_name, " already exists")
- X outfile := open(idxfile_name, "w") |
- X stop("idxtext: can't open ", idxfile_name)
- X
- X # Write index to index.IDX file.
- X write_index(infile, outfile)
- X
- X every close(infile | outfile)
- X
- X }
- X
- Xend
- X
- X
- Xprocedure write_index(in, out)
- X
- X local key_offset_table, w, line, KEY
- X
- X # Write to out all keys in file "in," with their byte
- X # offsets.
- X
- X key_offset_table := table()
- X
- X while (w := where(in), line := read(in)) do {
- X line ? {
- X if ="::" then {
- X KEY := trim(tab(0))
- X if not (/key_offset_table[KEY] := KEY || "\t" || w)
- X then stop("idxtext: duplicate key, ",KEY)
- X }
- X }
- X }
- X
- X # First line of index contains the offset just past the
- X # last indexed byte of in, so that we can still search
- X # unindexed parts of in.
- X write(out, where(in))
- X
- X # Write sorted KEY\toffset lines.
- X if *key_offset_table > 0 then
- X every write(out, (!sort(key_offset_table))[2])
- X
- X return
- X
- Xend
- END_OF_FILE
- if test 3452 -ne `wc -c <'idxtext.icn'`; then
- echo shar: \"'idxtext.icn'\" unpacked with wrong size!
- fi
- # end of 'idxtext.icn'
- fi
- if test -f 'jarg2get.icn' -a "${1}" != "-c" ; then
- echo shar: Will not clobber existing file \"'jarg2get.icn'\"
- else
- echo shar: Extracting \"'jarg2get.icn'\" \(1653 characters\)
- sed "s/^X//" >'jarg2get.icn' <<'END_OF_FILE'
- X############################################################################
- X#
- X# Name: jarg2get.icn
- X#
- X# Title: jargon to gettext format converter
- X#
- X# Author: Richard L. Goerwitz
- X#
- X# Version: 1.1
- X#
- X############################################################################
- X#
- X# Converts jargon.ascii (stdin) to a format suitable for use by gettext.
- X# Writes to stdout. Jargon.ascii was posted recently (c. March 1, 1991)
- X# to alt.sources.
- X#
- X############################################################################
- X
- Xprocedure main()
- X
- X local line, KEY, key_set, no, yes, blank_count
- X
- X blank_count := 0
- X key_set := set()
- X no := &ucase || "-/"; yes := &lcase || " "
- X # Isn't goal-directed evaluation nice?
- X (match("= A =", !&input), "" == !&input)
- X
- X # Read stdin, looking for entries. Entries can be distinguished
- X # a) by a preceding blank line, and b) by the presence of charac-
- X # ters beginning immediately at the margin, and c) by the presence
- X # of a colon plus a space on the line.
- X while line := trim(read(), '\t \xFF\r') do {
- X
- X if "" == line then {
- X if (blank_count +:= 1) > 2
- X then exit(0)
- X else write()
- X }
- X else {
- X line ? {
- X if match("Hacker Folklore"|"Appendix A: ")
- X then exit(0)
- X if blank_count > 0 &
- X KEY := map(tab(any(&letters)) || tab(find(": ")),no,yes)
- X then {
- X KEY := trim(KEY,' :')
- X if not member(key_set, KEY)
- X then write("::", KEY)
- X insert(key_set, KEY)
- X }
- X (="= ", tab(any(&ucase)), =" =", !&input) | write(line)
- X }
- X blank_count := 0
- X }
- X }
- X
- X stop("jarg2get: aborting (are you sure you have the correct file?)")
- X
- Xend
- END_OF_FILE
- if test 1653 -ne `wc -c <'jarg2get.icn'`; then
- echo shar: \"'jarg2get.icn'\" unpacked with wrong size!
- fi
- # end of 'jarg2get.icn'
- fi
- if test -f 'jargon.src' -a "${1}" != "-c" ; then
- echo shar: Will not clobber existing file \"'jargon.src'\"
- else
- echo shar: Extracting \"'jargon.src'\" \(3147 characters\)
- sed "s/^X//" >'jargon.src' <<'END_OF_FILE'
- X############################################################################
- X#
- X# Name: jargon.icn
- X#
- X# Title: look up words in hackers' jargon database
- X#
- X# Author: Richard L. Goerwitz
- X#
- X# Version: 1.10
- X#
- X############################################################################
- X#
- X# Defines hackers' jargon. Usage is simply "jargon word," where word
- X# is some bit of hackers' slang for which a definition is desired.
- X# Aborts with an exit code of 1 on no-arg invocation. If a "word"
- X# arg is given, but no definition is found, jargon exits with status
- X# 2. Otherwise the appropriate entry for "word" is displayed. If
- X# you aren't sure of precisely what word you are looking for, then
- X# you can type "jargon -p pattern" (where pattern is an egrep-style
- X# regular expression). Jargon will provide a list of entries con-
- X# taining pattern on the standard output. Note that this option will
- X# probably be very slow for non-UNIX installations.
- X#
- X# Tested on the jargon file, version 2.7.1 (posted to alt.sources on
- X# March 1, 1991). Tested also on the 2.8.3 version FTPed from one or
- X# another archive site that I don't recall offhand. It may work on
- X# other versions as well.
- X#
- X############################################################################
- X#
- X# Links: gettext.icn, adjuncts.icn (getkeys.icn, findre.icn)
- X#
- X############################################################################
- X
- Xprocedure main(a)
- X
- X local n, usage, no, yes, is_UNIX, firstarg, pat, cmd, entry
- X
- X # Change this, if you use a different location.
- X n := "/usr/local/lib/jargon/jargon.wrd"
- X is_UNIX := find("UNIX", &features)
- X
- X no := &ucase || "-/"; yes := &lcase || " "
- X usage := "<usage> ::= <progname> <arguments> \n_
- X <progname> ::= \"jargon\" \n_
- X <arguments> ::= <word> | \"-p\" <pattern>"
- X firstarg := pop(a) | stop(usage)
- X
- X if firstarg == "-p" then {
- X
- X # User wants to see a list of entries which match a pattern.
- X pat := map(pop(a), no, yes) | stop(usage)
- X # If there are still more arguments, the user has screwed up.
- X *a = 0 | stop(usage)
- X
- X # Search for pat in the list of entries. If running under UNIX,
- X # egrep the index file. Otherwise, use an Icon-only egrep sub-
- X # stitute (slow).
- X if \is_UNIX then {
- X # Ah, UNIX. Use the system egrep command to match pat.
- X _slash := "/"; _baselen := 10
- X cmd := "egrep '"|| pat ||".*\t' "|| Pathname(n) || getidxname(n)
- X in := open(cmd, "pr") | stop("error (main): can't egrep IDX file")
- X every entry := !in do
- X entry ? write(1(tab(find("\t")+1), tab(many(&digits)), pos(0)))
- X close(in)
- X }
- X else {
- X # Not UNIX. Use (slow) Icon-only regexp handler.
- X every entry := getkeys(n) do {
- X if findre(pat, entry) then
- X write(entry)
- X }
- X }
- X # If any entries contained pat, then exit with zero status.
- X if \entry then exit(0)
- X else exit(2)
- X }
- X
- X # Firstarg is not -p, and the sole argument must name an entry the
- X # user wants retrieved.
- X else {
- X # If we still have elements in a, the user has screwed up.
- X *a = 0 | stop(usage)
- X write(gettext(trim(map(firstarg, no, yes)), n)) | exit(2)
- X exit(0)
- X }
- X
- Xend
- END_OF_FILE
- if test 3147 -ne `wc -c <'jargon.src'`; then
- echo shar: \"'jargon.src'\" unpacked with wrong size!
- fi
- # end of 'jargon.src'
- fi
- echo shar: End of archive 1 \(of 1\).
- cp /dev/null ark1isdone
- MISSING=""
- for I in 1 ; do
- if test ! -f ark${I}isdone ; then
- MISSING="${MISSING} ${I}"
- fi
- done
- if test "${MISSING}" = "" ; then
- echo You have the archive.
- rm -f ark[1-9]isdone
- else
- echo You still must unpack the following archives:
- echo " " ${MISSING}
- fi
- exit 0
- exit 0 # Just in case...
- --
- Kent Landfield INTERNET: kent@sparky.IMD.Sterling.COM
- Sterling Software, IMD UUCP: uunet!sparky!kent
- Phone: (402) 291-8300 FAX: (402) 291-4362
- Please send comp.sources.misc-related mail to kent@uunet.uu.net.
-